(57) Abstract
A symmetric key cryptographic method is provided for short operations. The method includes batching a plurality of
operation parameters (1503), and performing an operation according to a corresponding operation parameter (1505). The symmetric
key cryptographic method is a Data Encryption Standard (DES) method. The short operations can be less than about 80 bytes. The
short operations can be between 8 and 80 bytes. The method includes reading the batched parameters from a dynamic random access
memory (1504), and transmitting each operation through a DES engine according to the operations parameter (1505).
IMPROVING DES HARDWARE THROUGHPUT FOR SHORT OPERATIONS

This is a non-provisional application claiming the benefit of provisional application
serial No. 60/201,002, filed May 1, 2000. 

Technical Field

The present invention relates to cryptographic support, and more particularly to 
cryptographic support for short operations. 
Background Art

Data Encryption Standard (DES) is a widely-used method of data encryption using 
private keys. There are 72 quadrillion or more possible encryption keys under DES that
can be used for protecting packets between parties over electronic networks. For each packet 
or message, a key is chosen at random. Like other symmetric key cryptographic methods, 
both the sender and receiver need to know and use the same private key. 

DES applies a 56-bit key to each 64-bit block of data. The process can run in several
modes and includes 16 rounds of operations. Although this is considered strong encryption,
many companies use triple-DES (TDES), which applies three keys in succession to each 
packet. 
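
The key-space and block figures above follow directly from the key and block sizes. The short Python check below is illustrative only; the helper name `blocks_needed` is introduced here and assumes a message is padded up to whole 64-bit blocks.

```python
KEY_BITS = 56      # DES key length in bits
BLOCK_BYTES = 8    # one 64-bit DES block

# Number of distinct 56-bit keys: roughly 72 quadrillion.
keyspace = 2 ** KEY_BITS


def blocks_needed(length_bytes: int) -> int:
    """Hypothetical helper: 64-bit blocks needed for a message, rounding up."""
    return -(-length_bytes // BLOCK_BYTES)


print(keyspace)            # 72057594037927936
print(blocks_needed(8))    # 1
print(blocks_needed(80))   # 10
```

Under this assumption, a short operation of 8 to 80 bytes occupies between one and ten DES blocks.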

DES originated at IBM in 1977 and was adopted by the U.S. Department of Defense. 
It is specified in the ANSI X3.92 and X3.106 standards and in the Federal Information
Processing Standards (FIPS) 46 and 81 standards.

Typically, cryptographic methods focus on large packets (greater than about 80 
bytes). However, when a DES system is used for smaller packets, the performance may drop 
by an order of magnitude. 
Therefore, a need exists for a system and method of cryptographic support for DES
operations which has high throughput for both long (>80 bytes) and shorter packets.
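
One way to see why short operations can suffer is a simple cost model in which every operation pays a fixed setup cost before any data moves. The sketch below is illustrative only: the overhead and engine-rate figures are hypothetical placeholders chosen to show the shape of the curve, not measurements from this disclosure.

```python
def effective_throughput(data_bytes, overhead_s, engine_bytes_per_s):
    """Bytes per second when each operation pays a fixed per-operation overhead."""
    return data_bytes / (overhead_s + data_bytes / engine_bytes_per_s)


# Hypothetical figures (assumptions, not taken from the source).
OVERHEAD_S = 100e-6      # assumed fixed cost per operation, in seconds
ENGINE_RATE = 20e6       # assumed raw engine speed, in bytes per second

for size in (8, 80, 800, 8000, 80000):
    rate = effective_throughput(size, OVERHEAD_S, ENGINE_RATE)
    print(f"{size:>6} bytes: {rate / 1e6:6.2f} MB/s")
```

With these assumed numbers, the fixed cost dominates for 80-byte operations, and the effective rate falls more than tenfold below that of 8000-byte operations, even though the engine itself is unchanged.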
Disclosure of the Invention

According to an embodiment of the present invention, a symmetric key cryptographic
method is provided for short operations. The method includes batching a plurality of
operation parameters, and performing an operation according to a corresponding operation
parameter. The symmetric key cryptographic method is a Data Encryption Standard (DES) 
method. The short operations can be less than about 80 bytes. The short operations can be 
between 8 and 80 bytes. 

The method includes batching the plurality of operation parameters and a plurality of
DES operations into a single request, calling DES for each operation in the request, and
performing DES for each operation separately according to the corresponding operation
parameter.

The method further includes batching the plurality of operation parameters and a
plurality of DES operations into a single request, calling DES for the batched operations, and
performing DES for each operation separately according to the corresponding operation
parameter. Each request is performed with a chip reset, a key and an initialization vector.
Calling the DES for the batched operations further comprises switching a context for the
batched operations. The context switch is between an application layer and a system software
layer.
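
The effect of batching on the number of requests can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `OperationParams`, `CardStub`, `run_unbatched` and `run_batched` are hypothetical names, and the stub returns data unchanged rather than performing DES.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OperationParams:
    """Hypothetical per-operation parameters carried inside a request."""
    key_index: int
    iv: bytes
    encrypt: bool
    data: bytes


class CardStub:
    """Stand-in for the card: counts requests instead of doing real DES."""

    def __init__(self) -> None:
        self.requests = 0

    def _des(self, op: OperationParams) -> bytes:
        # Placeholder transform, NOT real DES: the data is returned unchanged.
        return op.data

    def request(self, ops: List[OperationParams]) -> List[bytes]:
        # One request models one application/system-software context switch.
        self.requests += 1
        # Each operation is still performed separately with its own parameters.
        return [self._des(op) for op in ops]


def run_unbatched(card: CardStub, ops: List[OperationParams]) -> List[bytes]:
    """One request per operation."""
    return [card.request([op])[0] for op in ops]


def run_batched(card: CardStub, ops: List[OperationParams]) -> List[bytes]:
    """All operations and their parameters in a single request."""
    return card.request(ops)


ops = [
    OperationParams(key_index=i % 4, iv=bytes(8), encrypt=True,
                    data=bytes([i % 256]) * 8)
    for i in range(1000)
]
unbatched_card, batched_card = CardStub(), CardStub()
unbatched_out = run_unbatched(unbatched_card, ops)
batched_out = run_batched(batched_card, ops)
print(unbatched_card.requests, batched_card.requests)   # 1000 1
```

Both paths produce the same per-operation outputs; only the number of requests, and therefore the number of context switches in this model, differs.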

The method includes reading the batched parameters from a dynamic random access 
memory, and transmitting each operation through a DES engine according to the operations 
parameter. 
According to an embodiment of the present invention, a method is provided for
improved DES short operation throughput. The method includes batching a plurality of
operation parameters, each operation parameter corresponding to an operation, reading the
batched operation parameters into a dynamic random access memory, and transmitting each
operation through a DES engine according to the operations parameter. The DES is
external-to-external and an output for each operation is transmitted separately. The short
operation can be less than about 80 bytes. The short operation can be between 8 and 80
bytes.

According to an embodiment of the present invention, a symmetric key cryptographic 
method is provided for operations between about 8 and about 80 bytes in length. The method
includes providing a key index to an engine, and pumping the operations through the engine 
in bulk wherein a central processing unit does not handle the bytes. The engine is a DES 
engine. 

The method includes resetting an engine chip for an operation, reading an
initialization vector, and loading the initialization vector into the engine chip. The method
further includes determining a key from the key index, loading the key into the engine chip, 
and reading a data length for the operation. 

The method includes transmitting the data length through an Input channel into the
engine chip, and transmitting the data length through an Output channel. The channels are
FIFOs.
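
The per-operation sequence described above (reset, load the initialization vector, resolve and load the key, then move the stated number of bytes through the Input and Output FIFOs) can be sketched with a stub engine. All names here (`EngineStub`, `run_operation`, `key_table`) are hypothetical, the stub copies bytes rather than encrypting them, and reading "transmitting the data length" as "moving that many bytes" is an interpretation of the text.

```python
from collections import deque


class EngineStub:
    """Hypothetical stand-in for the engine chip; it logs the steps it is given."""

    def __init__(self):
        self.log = []
        self.input_fifo = deque()
        self.output_fifo = deque()

    def reset(self):
        self.log.append("reset")

    def load_iv(self, iv):
        self.log.append("load_iv")

    def load_key(self, key):
        self.log.append("load_key")

    def pump(self, length):
        # Move `length` bytes from the input FIFO to the output FIFO.
        # A real engine would encrypt or decrypt here; the stub copies bytes.
        for _ in range(length):
            self.output_fifo.append(self.input_fifo.popleft())
        self.log.append(f"pump {length}")


def run_operation(engine, key_table, key_index, iv, length):
    """Perform one operation using only its parameters, never touching the data."""
    engine.reset()
    engine.load_iv(iv)
    engine.load_key(key_table[key_index])   # key determined from the key index
    engine.pump(length)


engine = EngineStub()
engine.input_fifo.extend(b"A" * 8 + b"B" * 16)   # data queued for two operations
key_table = {0: bytes(8), 1: bytes(8)}           # placeholder keys
batched_params = [(0, bytes(8), 8), (1, bytes(8), 16)]

for key_index, iv, length in batched_params:
    run_operation(engine, key_table, key_index, iv, length)

print(engine.log)
```

In this sketch the controlling code handles only the parameters (key index, IV, length); the data bytes pass from FIFO to FIFO without being inspected.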

Brief Description of Drawings 

Preferred embodiments of the present invention will be described below in more 
detail, with reference to the accompanying drawings: 
Fig. 1 is a diagram of the DES architecture according to an embodiment of the present
invention; 

Fig. 2 is another diagram of the DES architecture according to an embodiment of the
present invention; 

Fig. 3 is still another diagram of the DES architecture according to an embodiment of
the present invention;

Fig. 4 is yet another diagram of the DES architecture according to an embodiment of
the present invention; 

Fig. 5 is a diagram of the FIFO structure supporting DES/TDES with a coprocessor
according to an embodiment of the present invention;

Fig. 6 is another diagram of the FIFO structure supporting DES/TDES with a 
coprocessor according to an embodiment of the present invention; 

Fig. 7 is still another diagram of the FIFO structure supporting DES/TDES with a 
coprocessor according to an embodiment of the present invention; 

Fig. 8 is yet another diagram of the FIFO structure supporting DES/TDES with a
coprocessor according to an embodiment of the present invention;

Fig. 9 is a further diagram of the FIFO structure supporting DES/TDES with a
coprocessor according to an embodiment of the present invention; 

Fig. 10 is a diagram of the FIFO structure supporting DES/TDES with a coprocessor
according to an embodiment of the present invention;

Fig. 11 is a flow diagram of an application handling two operations as separate 
sccRequests according to the prior art; 

Fig. 12 is a flow diagram illustrating a batched host-card interaction according to an 
embodiment of the present invention; 

Fig. 13 is a flow diagram of multiple operations batched into a single call according
to an embodiment of the present invention; 

Fig. 14 is a flow diagram of a method which reduces data transfers for each operation 
according to an embodiment of the present invention; 

Fig. 15 is a flow diagram of a method which batches parameters for all operations
into a block according to an embodiment of the present invention; and

Fig. 16 is a graph illustrating DES speeds for various embodiments of the present 
invention.

Best Mode for Carrying Out the Invention 

The present invention provides a system and method for cryptographic support which
has high throughput for long and short DES operations. According to an embodiment of the
present invention, the system includes a multi-chip embedded module, packaged in a
Peripheral Component Interconnect (PCI) card. In addition to cryptographic hardware and
circuitry for tamper detection and response, a general-purpose computing environment is
provided including a central processing unit, and executing software stored in ROM and/or
Flash memory.

Referring to Fig. 1, the multiple-layer software architecture of the client 101 and the 
host 105 is shown. The client-side includes foundational security control in Layers 0 and 1 
102, a supervisor-level software system in Layer 2 103, and a user-level software application 
in Layer 3 104. Layer 2 103 supports application development. Within Layer 2 103, a kernel
provides the operating system abstractions of multiple processes and address spaces; these 
abstractions support independent managers, which handle cryptographic hardware and other 
input/output (I/O) on the bottom, and provide higher-level application program interfaces 
(APIs) to the Layer 3 application 104. An API is the specific method prescribed by a 

computer or by another program by which a programmer writing an application program can
make requests of the operating system or another application. Typically, the Layer 3
application 104 in turn provides an abstraction of its own API to a host-side application 107.

The host-side 105 includes a device driver 106 and a host application 107. According
to Fig. 2, for the Layer 3 application 104 to use a service provided by the card-side
application, the host-side application 107 issues a call to the host-side device driver 106. The
device driver 106 opens an sccRequest 108 to the Layer 2 system 103 on the device.
Layer 2 103 informs the Layer 3 application 104 resident on the device of the existence of
the request, and the parameters the host sent along with the request.

According to Figs. 3 and 4, the Layer 3 application 104 handles the host application's
request for service; for example, it can direct Layer 2 103 to transfer data 109 to the device
driver 106 and perform the needed cryptographic operations. The Layer 3 application 104
closes out the sccRequest 110 and sends the output back 111 to the host application 107.

According to an embodiment of the present invention, a device for fast cryptography
is provided. The device includes a coprocessor having a central processing unit (CPU), at
least two levels of internal software and at least three data paths. The software levels can
include an operating system or kernel level and an application level. The data paths can
include an external to internal memory and/or CPU path, an internal memory and/or CPU to
a symmetric engine path, and a channel between the external system and the symmetric
engine. The channel can be a first-in first-out (FIFO). According to an embodiment of the
present invention, the device includes a FIFO state machine. The FIFO state machine
structure transports or drives data into and out of the method engine.

It should be noted that while the present invention is presented in terms of a
symmetric cryptographic function (e.g., DES), the invention contemplates any parameterized
function on variable length data. Thus, DES is provided as an example of an embodiment of
the present invention and given the teachings of the present invention provided herein, one of
ordinary skill in the related art will be able to contemplate these and similar implementations
or configurations of the present invention.
Referring to Fig. 5, the FIFO structure works with the DES/TDES engine 500. The
present invention is described according to an IBM 4758 coprocessor, specifically Models
002/023 PCI cryptographic coprocessors; however, given the teachings of the present
invention provided herein, one of ordinary skill in the related art will be able to contemplate
these and similar implementations or configurations.

In Model 2 hardware, the FIFO structure also supports fast Secure Hash Algorithm 1
(SHA-1), though the structure may be applied to any method engine.

For both input and output, two pairs of FIFOs 501-504, a PCI FIFO pair 501-502 and
an internal FIFO pair 503-504, are provided for external and internal transfer, respectively, as
well as a Direct Memory Access (DMA) controller 505-506 for CPU-free transfer into and
out of internal dynamic random access memory (DRAM) 507.

The internal CPU 508 selects which data paths to activate, and what key, initialization
vector (IV), and other operational parameters the DES engine 500 may use, via control
registers (not shown). The IV is generated by a random number generator, typically included
in the Layer 2 system, and combined with the unencrypted text and the key. The key is a
variable value applied to a block of unencrypted text to produce encrypted text.
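
How an IV can be combined with the unencrypted text and the key is illustrated below, assuming cipher block chaining (CBC), one of the modes defined in FIPS 81; the text above does not state which mode is used. The block function here is a toy placeholder, not DES and not secure, and all function names are hypothetical.

```python
BLOCK = 8  # bytes in one 64-bit block


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    # Placeholder for the DES block function (NOT DES): XOR, then rotate left.
    mixed = xor(block, key)
    return mixed[1:] + mixed[:1]


def toy_block_decrypt(block: bytes, key: bytes) -> bytes:
    # Inverse of the placeholder: rotate right, then XOR.
    unrotated = block[-1:] + block[:-1]
    return xor(unrotated, key)


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a whole number of blocks")
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # Each plaintext block is mixed with the previous ciphertext block
        # (the IV for the first block) before the keyed block function.
        prev = toy_block_encrypt(xor(plaintext[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(toy_block_decrypt(block, key), prev))
        prev = block
    return b"".join(out)


key = bytes(range(8))              # placeholder 8-byte key for the toy function
message = b"16 byte message!"      # two blocks
ct_a = cbc_encrypt(message, key, iv=bytes(8))
ct_b = cbc_encrypt(message, key, iv=bytes([1] * 8))
```

Under this assumption, the same text and key produce different encrypted text when the IV differs, which is why a fresh IV is loaded for each operation.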

Configurations of the DES engine 500 include bulk external-to-external DES (shown 
in Fig. 8), bulk internal-to-internal DES (output DMA 506 to internal input FIFO 503 to DES 
500, then back through the Internal Output FIFO 504 and PCI Output FIFO 502), and DMA 
transfer (e.g., PCI input FIFO 501 to internal input FIFO 503 to input DMA 505 and from the 